Overview
Phase 1 establishes the foundation of the EDL Pipeline by fetching the complete market dataset and fundamental metrics. This phase produces the critical master_isin_map.json file that all subsequent phases depend on.
Execution Order
Phase 1 runs these scripts sequentially:
1. Fetch Master Stock List
   Script: fetch_dhan_data.py
   Fetches all NSE equity stocks in a single API call.
2. Fetch Fundamental Data
   Script: fetch_fundamental_data.py
   Iterates through each ISIN to fetch quarterly results and financial ratios.
Script 1: fetch_dhan_data.py
Purpose
Fetches the complete list of NSE equity stocks (~2,775 symbols) with basic metrics and creates the master ISIN mapping file.
API Endpoint
Request Payload
Output Files
| File | Description | Size | Records |
|---|---|---|---|
| dhan_data_response.json | Full API response with all stock data | ~3 MB | 2,775 |
| master_isin_map.json | Critical: Symbol ↔ ISIN ↔ Sid mapping | ~500 KB | 2,775 |
master_isin_map.json Structure
Dependencies
- Requires: Internet connection, valid API headers
- Depends on: None (foundation script)
Typical Execution Time
~5-10 seconds — Single API call fetching 2,775 stocks
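As an illustration of how a Symbol ↔ ISIN ↔ Sid mapping like master_isin_map.json might be consumed, the sketch below indexes mapping records by symbol and by ISIN. The key names (`Symbol`, `ISIN`, `Sid`) and the sample records are assumptions made up for this example. The real file's layout may differ.

```python
# Hypothetical records: key names and values are invented for illustration
# and are NOT taken from the real master_isin_map.json.
SAMPLE_MAP = [
    {"Symbol": "ALPHA", "ISIN": "INE000A00001", "Sid": 101},
    {"Symbol": "BETA", "ISIN": "INE000B00002", "Sid": 102},
]


def build_lookups(records, symbol_key="Symbol", isin_key="ISIN"):
    """Index mapping records by symbol and by ISIN for two-way lookup."""
    by_symbol = {}
    by_isin = {}
    for rec in records:
        by_symbol[rec[symbol_key]] = rec
        by_isin[rec[isin_key]] = rec
    return by_symbol, by_isin


by_symbol, by_isin = build_lookups(SAMPLE_MAP)
print(by_symbol["ALPHA"]["ISIN"])
print(by_isin["INE000B00002"]["Symbol"])
```

If the real file uses different key names, only the `symbol_key` and `isin_key` arguments need to change.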
Script 2: fetch_fundamental_data.py
Purpose
Fetches quarterly results, financial ratios, and TTM metrics for each stock using the ISIN list in master_isin_map.json (produced by fetch_dhan_data.py).
API Endpoint
Request Payload
Data Fetched
- Quarterly Results: Latest 4 quarters + YoY comparison
- Income Statement: Revenue, Net Profit, OPM, EPS
- Balance Sheet: Total Assets, Liabilities, Equity
- Ratios: ROE, ROCE, Debt/Equity, P/E, P/B
- TTM Metrics: Trailing twelve month calculations
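The TTM and YoY items above can be made concrete with a small worked example. This sketch assumes the conventional definitions: a trailing-twelve-month figure for a flow metric (such as revenue or net profit) is the sum of the latest four quarters, and YoY compares a quarter with the same quarter one year earlier. The source does not state how the API computes these values, and the function names and numbers below are hypothetical.

```python
def ttm_sum(quarterly_values):
    """Sum the latest four quarters; input is ordered oldest to newest."""
    if len(quarterly_values) < 4:
        raise ValueError("need at least four quarters for a TTM figure")
    return sum(quarterly_values[-4:])


def yoy_growth_pct(current, year_ago):
    """Percentage change versus the same quarter one year earlier."""
    if year_ago == 0:
        return None  # growth is undefined against a zero base
    return (current - year_ago) / abs(year_ago) * 100.0


# Made-up quarterly revenue, oldest first (five quarters).
revenue = [100.0, 110.0, 120.0, 130.0, 140.0]

ttm_revenue = ttm_sum(revenue)                      # 110 + 120 + 130 + 140
yoy_revenue = yoy_growth_pct(revenue[-1], revenue[-5])

print(ttm_revenue)  # 500.0
print(yoy_revenue)  # 40.0
```

Ratios such as ROE or P/E are not summed this way. They would be recomputed from TTM numerators and period-end denominators.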
Output Files
| File | Description | Size | Records |
|---|---|---|---|
| fundamental_data.json | Complete fundamental dataset | ~35 MB | 2,775 |
Output Structure
Dependencies
- Requires: master_isin_map.json (from fetch_dhan_data.py)
- Timeout: 30 seconds per request
- Threading: 20 concurrent workers
Typical Execution Time
~2-3 minutes — Fetching 2,775 stocks with 20 threads
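The per-ISIN fetch with 20 workers and a 30-second timeout could be structured roughly as follows. This is a minimal sketch, not the script's actual code: `fetch_one` is a hypothetical callable standing in for the real API request, and `fake_fetch` is a stub so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 20      # concurrent workers, as listed under Dependencies
REQUEST_TIMEOUT = 30  # seconds per request, as listed under Dependencies


def fetch_all(isins, fetch_one, max_workers=MAX_WORKERS, timeout=REQUEST_TIMEOUT):
    """Fetch data for each ISIN concurrently.

    fetch_one(isin, timeout) is a hypothetical stand-in for the real request.
    Returns (results, errors), both keyed by ISIN, so that one failed
    request does not discard the rest.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, isin, timeout): isin for isin in isins}
        for future in as_completed(futures):
            isin = futures[future]
            try:
                results[isin] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[isin] = str(exc)
    return results, errors


def fake_fetch(isin, timeout):
    """Stub fetcher: ISINs ending in 'X' simulate a timed-out request."""
    if isin.endswith("X"):
        raise TimeoutError(f"no response within {timeout}s")
    return {"isin": isin, "ratios": {}}


results, errors = fetch_all(["INE000A00001", "INE000B00002", "BADX"], fake_fetch)
print(sorted(results))
print(errors)
```

Collecting errors per ISIN instead of raising keeps a single bad symbol from aborting a run of ~2,775 requests.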
NSE Listing Dates Download
Purpose
Downloads the official NSE equity listing dates CSV for enrichment in Phase 3.
Command
Output Files
| File | Description | Format |
|---|---|---|
| nse_equity_list.csv | Symbol → Listing Date mapping | CSV |
CSV Structure
This is a non-critical download. Pipeline continues even if this fails.
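One way a later phase might read this optional file is sketched below: if the CSV is missing or lacks the expected columns, the loader returns an empty mapping instead of raising. The column names `SYMBOL` and `DATE OF LISTING` are assumptions for this example, as are the function names and sample rows.

```python
import csv
import io


def parse_listing_dates(csv_text, symbol_col="SYMBOL", date_col="DATE OF LISTING"):
    """Build a symbol -> listing date dict from CSV text.

    Column names are assumed, not confirmed by the source. Header cells are
    stripped because exported CSVs sometimes pad them with spaces.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [cell.strip() for cell in next(reader, [])]
    if symbol_col not in header or date_col not in header:
        return {}
    s_idx, d_idx = header.index(symbol_col), header.index(date_col)
    dates = {}
    for row in reader:
        if len(row) > max(s_idx, d_idx):
            dates[row[s_idx].strip()] = row[d_idx].strip()
    return dates


def load_listing_dates(path):
    """Read the optional CSV; a missing file yields an empty mapping."""
    try:
        with open(path, newline="", encoding="utf-8") as fh:
            return parse_listing_dates(fh.read())
    except OSError as exc:
        print(f"[warn] listing dates unavailable ({exc}); continuing without them")
        return {}


# Made-up sample rows for illustration.
sample = "SYMBOL, DATE OF LISTING\nALPHA,01-JAN-2020\nBETA,15-MAR-2021\n"
listing_dates = parse_listing_dates(sample)
missing = load_listing_dates("/nonexistent/nse_equity_list.csv")
print(listing_dates)
print(missing)
```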
Phase 1 Output Summary
Files Produced
Critical Dependencies for Phase 2+
Error Handling
Critical Failure: fetch_dhan_data.py
If this script fails, the pipeline stops immediately, because every later step depends on master_isin_map.json.
Non-Critical: Other Failures
- fetch_fundamental_data.py fails: Pipeline continues, but fundamental fields will be empty
- NSE CSV download fails: Pipeline continues, listing dates will be missing
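The critical versus non-critical behaviour described in this section can be sketched as a small step runner. This is an illustrative outline under assumed names (`run_phase` and the stub functions are hypothetical), not the pipeline's actual orchestration code.

```python
def run_phase(steps):
    """Run (name, func, critical) steps in order.

    A failing critical step aborts with RuntimeError. A failing
    non-critical step is logged and the run continues.
    Returns a dict mapping step name to "ok" or "failed".
    """
    status = {}
    for name, func, critical in steps:
        try:
            func()
            status[name] = "ok"
        except Exception as exc:
            status[name] = "failed"
            if critical:
                raise RuntimeError(f"critical step {name} failed: {exc}") from exc
            print(f"[warn] {name} failed ({exc}); continuing")
    return status


# Stubs standing in for the real scripts.
def fetch_master():
    return None


def fetch_fundamentals():
    raise ConnectionError("simulated outage")


def download_nse_csv():
    return None


status = run_phase([
    ("fetch_dhan_data.py", fetch_master, True),
    ("fetch_fundamental_data.py", fetch_fundamentals, False),
    ("nse_csv_download", download_nse_csv, False),
])
print(status)
```

Marking criticality per step keeps the stop-or-continue policy in one place, so adding a new non-critical download does not require new error-handling code.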
Performance Metrics
Total Phase 1 Time
~2-4 minutes for 2,775 stocks (including NSE CSV download)
Bottlenecks
- fetch_fundamental_data.py: one request per ISIN, bounded by per-request latency and API rate limits (latency is mitigated with 20 threads)
- Network latency: Depends on connection speed
Optimization Tips
- Increase threading (if API allows)
- Cache master_isin_map.json between runs
- Skip fundamental refetch for unchanged stocks (requires change detection)
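The caching and change-detection tips could be approached as in the sketch below. It assumes an age-based freshness check on the cached file and a SHA-256 fingerprint of each stock's record to decide what to refetch. The function names, the one-hour age limit, and the sample records are hypothetical, and which record to fingerprint is a design decision the source leaves open.

```python
import hashlib
import json
import os
import tempfile
import time


def is_fresh(path, max_age_seconds, now=None):
    """True if the file exists and was modified within max_age_seconds."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) <= max_age_seconds


def fingerprint(record):
    """Stable hash of a JSON-serialisable record (key order is ignored)."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_keys(previous_fingerprints, current_records):
    """Keys whose record is new or differs from the stored fingerprint."""
    return [
        key for key, rec in current_records.items()
        if previous_fingerprints.get(key) != fingerprint(rec)
    ]


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "master_isin_map.json")
    fresh_when_missing = is_fresh(cache_path, 3600)
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump([], fh)
    fresh_when_new = is_fresh(cache_path, 3600)
    # Simulate checking two hours later against a one-hour limit.
    fresh_when_old = is_fresh(cache_path, 3600, now=time.time() + 7200)

# Made-up records keyed by ISIN.
old = {"INE000A00001": {"price": 10}, "INE000B00002": {"price": 20}}
new = {"INE000A00001": {"price": 10}, "INE000B00002": {"price": 21}}
stored = {key: fingerprint(rec) for key, rec in old.items()}
to_refetch = changed_keys(stored, new)
print(to_refetch)
```

An unchanged fingerprint only shows that the hashed record is unchanged, so it is a heuristic for skipping refetches and not a guarantee that the fundamentals are current.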
Next Phase
Once Phase 1 completes, the pipeline automatically proceeds to:
Phase 2: Data Enrichment
Fetches company filings, announcements, indicators, news, and surveillance data using the master ISIN map.